ABSTRACT

The two main criticisms against Stewart and Love's redundancy
index raised by Nicewander and Wood are examined. It is shown how
Stewart and Love's original claim that the index represents the amount
of overlapping or "redundant" variation between two sets of variables
is justified. It is also shown why Nicewander and Wood's assertion
that the redundancy index is not equal to the mean of the squared
multiple correlations between a linear composite of one set of
variables and the elements of the second set is incorrect.


Nicewander and Wood (1974) criticize the redundancy index $R^2_{y.x}$
first proposed by Stewart and Love (1968). They dispute the two claims
made by Stewart and Love that $R^2_{y.x}$ represents the proportion of
variance of the variable set Y predictable from the variable set X, and
that the index is the average of certain squared multiple correlations.
This note intends to show that both claims are in fact justified.

Nicewander and Wood (hereafter NW) discuss the first claim by
deriving the correlations between the original variables and their
respective canonical variates. Thus, for the X set, these "loadings"
are computed as

(1)     $r_{xu_i} = R_{xx} c_i$,

and, for the Y set,

(2)     $r_{yv_i} = R_{yy} d_i$,

where $R_{xx}$ and $R_{yy}$ represent the inter-correlation matrices of
the X variables and the Y variables, respectively. The vectors $c_i$ and
$d_i$, $i = 1, 2, \ldots, q_y$, are the eigenvectors whose elements are
the weights determining the X-variates (or $u_i$) and the Y-variates
(or $v_i$), respectively. There are $q_y$ variables in the Y set.


To compute the redundancy index it is necessary to first compute
the sum of the squared loadings. This sum becomes

(3)     $r'_{xu_i} r_{xu_i} = c_i' R_{xx}' R_{xx} c_i = c_i' R_{xx}^2 c_i$

for the X-set, and

(4)     $r'_{yv_i} r_{yv_i} = d_i' R_{yy}' R_{yy} d_i = d_i' R_{yy}^2 d_i$

for the Y-set. Stewart and Love (hereafter SL) define these sums as the
variance extracted by the i'th variate, $i = 1, 2, \ldots, q_y$, from the
X-set and Y-set, respectively. When the i'th sum is divided by the number
of variables in the set, "the resulting value is the proportion of the
variance in the set extracted by that canonical variate" (SL, p. 161).
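
The quantities in (1) through (4) can be sketched numerically. The following is a minimal illustration, not SL's or NW's own computation: the data are synthetic, the names (`mix`, `load_x`, `prop_y`, and so on) are hypothetical, and the canonical weights are obtained by Cholesky whitening followed by a singular value decomposition, which is one common way of solving the canonical correlation problem.

```python
import numpy as np

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
N, qx, qy = 500, 3, 2
mix = np.array([[1.0, 0.6, 0.3], [0.0, 1.0, 0.6], [0.0, 0.0, 1.0]])
X = rng.standard_normal((N, qx)) @ mix          # correlated X variables
Y = X @ rng.standard_normal((qx, qy)) + rng.standard_normal((N, qy))

# Correlation matrices R_xx, R_xy, R_yy.
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy, Ryy = R[:qx, :qx], R[:qx, qx:], R[qx:, qx:]

# Canonical weights: whiten each set with its Cholesky factor, then take the SVD.
Lx, Ly = np.linalg.cholesky(Rxx), np.linalg.cholesky(Ryy)
K = np.linalg.solve(Lx, Rxy) @ np.linalg.inv(Ly).T
A, lam, Bt = np.linalg.svd(K, full_matrices=False)
C = np.linalg.solve(Lx.T, A)     # columns are the weight vectors c_i
D = np.linalg.solve(Ly.T, Bt.T)  # columns are the weight vectors d_i

load_x = Rxx @ C                          # equation (1), one column per variate
load_y = Ryy @ D                          # equation (2)
extracted_x = (load_x ** 2).sum(axis=0)   # equation (3): c_i' R_xx^2 c_i
extracted_y = (load_y ** 2).sum(axis=0)   # equation (4): d_i' R_yy^2 d_i
prop_x = extracted_x / qx                 # SL's "proportion of variance extracted"
prop_y = extracted_y / qy
print("X-set proportions:", prop_x)
print("Y-set proportions:", prop_y)
```

Because the Y-set here has as many variates as variables, the Y-set proportions sum to one; the X-set proportions sum to at most one, since only $q_y$ of the possible $q_x$ X-variates are used.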

NW argue that this terminology is misleading. First, they claim,
all canonical variates have a variance of one since the usual constraints

(5)     $c' R_{xx} c = d' R_{yy} d = 1$

are in fact designed to ensure this feature. To imply that variances
differ is incorrect. Second, NW argue, the quantities derived in (3)
and (4) are "empirically meaningless since both $c_i' R_{xx}^2 c_i$ and
$d_i' R_{yy}^2 d_i$ are themselves devoid of any interpretation"
(NW, p. 93).

NW's first point seems to us to be of little relevance, and is
possibly based upon a misreading of SL. Clearly, the equalities in
(3) and (4) are different from those of (5). The equations in (5) refer
to the variances of certain variables; the equalities of (3) and (4)
relate to the covariances between different variables.
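
The distinction between (5) and (3) can be made concrete with a small numerical sketch. It assumes synthetic data and hypothetical variable names, and again uses Cholesky whitening plus an SVD to obtain the weights; it is meant only to show that the variance of each X-variate is one while the sum of squared correlations between that variate and the X variables is a different quantity.

```python
import numpy as np

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(1)
N, qx, qy = 400, 3, 2
mix = np.array([[1.0, 0.6, 0.3], [0.0, 1.0, 0.6], [0.0, 0.0, 1.0]])
X = rng.standard_normal((N, qx)) @ mix
Y = X @ rng.standard_normal((qx, qy)) + rng.standard_normal((N, qy))

# Standardize so that cross-products divided by N are correlations.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
Rxx, Rxy, Ryy = Xs.T @ Xs / N, Xs.T @ Ys / N, Ys.T @ Ys / N

Lx, Ly = np.linalg.cholesky(Rxx), np.linalg.cholesky(Ryy)
K = np.linalg.solve(Lx, Rxy) @ np.linalg.inv(Ly).T
A, lam, Bt = np.linalg.svd(K, full_matrices=False)
C = np.linalg.solve(Lx.T, A)                  # columns are the c_i

U = Xs @ C                                    # X-variates u_i
var_u = U.var(axis=0)                         # c_i' R_xx c_i, constraint (5)
corr_xu = Xs.T @ U / N                        # correlations between x_j and u_i
sum_sq = (corr_xu ** 2).sum(axis=0)           # left side of (3)
quad = np.einsum('ji,jk,ki->i', C, Rxx @ Rxx, C)  # c_i' R_xx^2 c_i
print("variances of the variates:", var_u)
print("sums of squared loadings: ", sum_sq)
```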

The second criticism made by NW partly follows from the first and
is thus also misplaced. In fact, one reason in favor of the SL
approach is that a standard procedure in measuring the "explication"
or "reproduction" or "extraction" by one variable or a linear
combination of variables of the variance of another variable is to
measure their correlation. The square of this correlation will then
measure the percentage explanation obtained.

NW should be given some credit, however. When the summing over
the squared variable loadings takes place, it should be kept in mind
that these are not orthogonal for any one canonical variate, and thus
speaking of a "total" amount of variation explained in the original
variables becomes somewhat misleading. When the averaging over the
number of variables is carried out, the resulting measure simply becomes
the mean proportion of variance explained in each original variable by
that canonical variate. However, with the original variables
standardized, the variances are all one, so that even for the
non-orthogonal case the SL statement quoted above (SL, p. 161) is
basically justified.

The second attack made by NW upon SL is somewhat more substantial.
SL point out that their redundancy index $R^2_{y.x}$ is equal to the mean
squared multiple correlation, where the multiple correlations refer to
each Y-variable regressed upon the whole X-set. SL give no proof of
the assertion. NW first state that the SL presentation is not clear,
and that there are two alternative interpretations of the assertion.
They attempt to show that the assertion is incorrect under either
interpretation.

Since NW's second interpretation represents a misunderstanding and
is thus incorrect, we will here concentrate upon the first
interpretation relating to each Y-variable regressed upon the X's. NW
rewrite the squared multiple correlation between $y_j$,
$j = 1, 2, \ldots, q_y$, and an optimum linear combination of the X set as

(6)     $R^2_{y_j.X} = r'_{xy_j} R_{xx}^{-1} r_{xy_j}$,

where $r_{xy_j}$ is the j'th column of $R_{xy}$, the matrix of
intercorrelations between the X's and Y's. The authors then state:
"Clearly, the average of these squared multiple correlations cannot be
equal to the redundancy index $R^2_{y.x}$" (NW, p. 93). We will show here
that this unproven assertion is in fact incorrect (although equation (6)
is in itself correct).
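
Equation (6) itself is easy to check against an ordinary least-squares fit. The sketch below assumes synthetic data and hypothetical names (`R2_formula`, `R2_ols`); it compares the quadratic form in (6) with the usual coefficient of determination from regressing each standardized Y variable on the standardized X set.

```python
import numpy as np

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(2)
N, qx, qy = 300, 3, 2
X = rng.standard_normal((N, qx))
Y = X @ rng.standard_normal((qx, qy)) + rng.standard_normal((N, qy))

# Standardize so that cross-products divided by N are correlations.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
Rxx = Xs.T @ Xs / N
Rxy = Xs.T @ Ys / N                     # column j is r_{xy_j}

# Equation (6): r'_{xy_j} R_xx^{-1} r_{xy_j} for every j at once.
R2_formula = np.einsum('ij,ij->j', Rxy, np.linalg.solve(Rxx, Rxy))

# The same quantity from a least-squares regression of each y_j on the X set.
beta, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
resid = Ys - Xs @ beta
R2_ols = 1.0 - (resid ** 2).sum(axis=0) / (Ys ** 2).sum(axis=0)
print(R2_formula, R2_ols)
```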

Since the original SL assertion has not been rigorously proven
before (although all empirical results indicate that it is correct) it
will be useful to fully develop the necessary algebraic relationships.
In what follows we will first establish the correctness of the SL
assertion for the case where the X and the Y matrices are of the same
rank ($q_x = q_y = q$). Then the generalization to different ranks, the
number of X variables being greater than the number of Y variables
($q_x > q_y$), will be carried out.

Using NW's notation, the SL index of redundancy is calculated as

(7)     $R^2_{y.x} = \frac{1}{q_y} \sum_{i=1}^{q_y} \lambda_i^2 \, (r'_{yv_i} r_{yv_i})$,

with the quantities defined as before, the $\lambda_i$ denoting the i'th
canonical correlation. Since the exposition will be clearer using
individual correlations, we note that

(8)     $r'_{yv_i} r_{yv_i} = \sum_{j=1}^{q_y} r^2_{y_j v_i}$.


To show that SL's assertion is true, we need to prove the following

Theorem: If (X|Y) is a matrix of N observations on 2q variables with
rank 2q (X being of order N by q, Y of order N by q), and (U|V) are the
corresponding canonical variates based on X and Y, respectively, then

(9)     $\frac{1}{q} \sum_{i=1}^{q} \lambda_i^2 \Big( \sum_{j=1}^{q} r^2_{y_j v_i} \Big) = \frac{1}{q} \sum_{j=1}^{q} R^2_{y_j.X}$.


Proof: We make use of the facts that

1) X and U are related by a non-singular transformation.

2) Similarly for Y and V.

3) $r_{u_i u_j} = r_{v_i v_j} = r_{u_i v_j} = 0$ for $i \neq j$, where $r_{u_i v_i} = \lambda_i$, $i, j = 1, 2, \ldots, q$.


For convenience, all variables are assumed standardized.

From 1) it follows that $R^2_{y_j.X} = R^2_{y_j.U}$.

From 3) it follows that $R^2_{y_j.U} = \sum_{i=1}^{q} r^2_{y_j u_i}$.

From 2) it follows that $y_j = \sum_{i=1}^{q} a_i v_i$, with

        $r_{y_j v_k} = \sum_{i=1}^{q} a_i r_{v_i v_k} = a_k$, using 3).


Similarly,

(10)     $r_{y_j u_k} = \sum_{i=1}^{q} a_i r_{v_i u_k}$

         $= a_k r_{v_k u_k}$, again using 3).

Combining these results, we have

        $R^2_{y_j.X} = R^2_{y_j.U} = \sum_{i=1}^{q} r^2_{y_j u_i} = \sum_{i=1}^{q} a_i^2 r^2_{v_i u_i}$,

so that

(11)     $R^2_{y_j.X} = \sum_{i=1}^{q} r^2_{y_j v_i} r^2_{v_i u_i} = \sum_{i=1}^{q} r^2_{y_j v_i} \lambda_i^2$.

Summing over j and dividing by q gives the desired result.
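
The theorem for the equal-rank case can be checked numerically. The following sketch assumes synthetic data with $q_x = q_y = q = 3$ and hypothetical variable names; it computes the left side of (9) from the canonical solution and the right side from equation (6), and also checks the per-variable identity (11).

```python
import numpy as np

# Synthetic data with q_x = q_y = q (an assumption for illustration only).
rng = np.random.default_rng(3)
N, q = 400, 3
mix = np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
X = rng.standard_normal((N, q)) @ mix
Y = X @ rng.standard_normal((q, q)) + rng.standard_normal((N, q))

R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy, Ryy = R[:q, :q], R[:q, q:], R[q:, q:]

# Canonical correlations lam and Y-weights D via whitening and an SVD.
Lx, Ly = np.linalg.cholesky(Rxx), np.linalg.cholesky(Ryy)
K = np.linalg.solve(Lx, Rxy) @ np.linalg.inv(Ly).T
_, lam, Bt = np.linalg.svd(K)
D = np.linalg.solve(Ly.T, Bt.T)

load_y = Ryy @ D                                   # entry (j, i) is r_{y_j v_i}
lhs = (lam ** 2 * (load_y ** 2).sum(axis=0)).sum() / q   # left side of (9)

R2 = np.einsum('ij,ij->j', Rxy, np.linalg.solve(Rxx, Rxy))  # equation (6)
rhs = R2.mean()                                    # right side of (9)

R2_from_11 = (load_y ** 2) @ lam ** 2              # right side of (11), per y_j
print(lhs, rhs)
```

The two printed values agree up to rounding error for this data set, which is what the theorem predicts; a single numerical example is of course an illustration and not a substitute for the proof.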

The generalization of this result to the case where the rank of the
X matrix is $q_x$, $q_x > q_y$, hinges on whether equality (10) is still
valid. We will establish that it is by showing that the $r_{v_i u_k}$
vanish for $k = q_y+1, q_y+2, \ldots, q_x$.


Again adopting NW's notation, we rewrite their equations (2) and
(3) as

(12)     $-\lambda R_{yy} d + R'_{xy} c = 0$

(13)     $R_{xy} d - \lambda R_{xx} c = 0$


This is the system of equations from which the canonical correlations
are derived. As is well known, the system has a nontrivial solution only
if the determinant of the coefficients equals zero. This can be written as

(14)     $\begin{vmatrix} -\lambda R_{yy} & R'_{xy} \\ R_{xy} & -\lambda R_{xx} \end{vmatrix} = 0$.


The determinantal equation (14) forms a polynomial of degree
$(q_y + q_x)$ in $\lambda$. The positive roots of this polynomial, in
descending order, yield the canonical correlations. We will show that
there are at least $(q_x - q_y)$ zero roots, and that there are $q_y$
nonnegative and $q_y$ nonpositive roots. Of prime interest in canonical
analysis are the $q_y$ nonnegative roots, generating, as NW indicate, the
canonical correlations $\lambda_i$, with corresponding vectors $c_i$ and
$d_i$, $i = 1, 2, \ldots, q_y$.

Relying on a well known result on the determinant of a partitioned
matrix (see, e.g., Dhrymes, 1970, p. 570), we can write (14) as

(15)     $|-\lambda R_{xx}| \; |-\lambda R_{yy} - R'_{xy} (-\lambda R_{xx})^{-1} R_{xy}| = 0$.


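The validity of the result requires $|-\lambda R_{xx}| \neq 0$, which,
for $\lambda \neq 0$, holds unless there is an exact linear relationship
between some X-variables; the final form (17) is a polynomial identity
in $\lambda$ and therefore remains valid at $\lambda = 0$. By factoring
out the appropriate terms, we see that (15) can be written

(16)     $(-\lambda)^{q_x} |R_{xx}| \; \left| -\frac{1}{\lambda} \left( \lambda^2 R_{yy} - R'_{xy} R_{xx}^{-1} R_{xy} \right) \right| = 0$,

and, further,

(17)     $(-1)^{q_x+q_y} \lambda^{q_x-q_y} |R_{xx}| \; |\lambda^2 R_{yy} - R'_{xy} R_{xx}^{-1} R_{xy}| = 0$.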

Here use is made of the property that, for any constant $\mu$ and any
non-singular matrix A of rank m, $|\mu A| = \mu^m |A|$. From (17) we see
that the determinantal equation is satisfied for $(q_x - q_y)$ zero roots
of $\lambda$. The nonzero roots yielding the usual canonical correlations
are in fact the nonzero roots of

(18)     $|\lambda^2 R_{yy} - R'_{xy} R_{xx}^{-1} R_{xy}| = 0$.

Thus, the complete set of correlations is made up of the $q_y$ roots
extracted from (18) augmented by the $(q_x - q_y)$ zero roots from (17),
and we have the desired result:

(19)     $r_{v_i u_k} = 0$, for all $k = q_y+1, q_y+2, \ldots, q_x$.
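
The root structure of the determinantal equation can be inspected numerically. The sketch below assumes synthetic data with $q_x = 4$ and $q_y = 2$ and hypothetical names (`Amat`, `Bmat`, `roots`). It treats (14) as a generalized symmetric eigenvalue problem, since (12) and (13) say that $\lambda$ satisfies $A w = \lambda B w$ with $w = (d, c)$, and compares its roots with the roots of (18).

```python
import numpy as np

# Synthetic data with q_x > q_y (an assumption for illustration only).
rng = np.random.default_rng(4)
N, qx, qy = 400, 4, 2
X = rng.standard_normal((N, qx))
Y = X @ rng.standard_normal((qx, qy)) + rng.standard_normal((N, qy))
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy, Ryy = R[:qx, :qx], R[:qx, qx:], R[qx:, qx:]

# (14) holds exactly when lambda solves Amat w = lambda * Bmat w.
Amat = np.block([[np.zeros((qy, qy)), Rxy.T], [Rxy, np.zeros((qx, qx))]])
Bmat = np.block([[Ryy, np.zeros((qy, qx))], [np.zeros((qx, qy)), Rxx]])
Linv = np.linalg.inv(np.linalg.cholesky(Bmat))
roots = np.linalg.eigvalsh(Linv @ Amat @ Linv.T)   # all q_x + q_y roots, ascending

# Roots of (18): lambda^2 are the eigenvalues of a whitened q_y by q_y matrix.
Lyinv = np.linalg.inv(np.linalg.cholesky(Ryy))
M = Lyinv @ Rxy.T @ np.linalg.solve(Rxx, Rxy) @ Lyinv.T
lam = np.sqrt(np.clip(np.linalg.eigvalsh(M), 0.0, None))[::-1]  # descending

print("roots of (14):", roots)
print("canonical correlations from (18):", lam)
```

For this example the roots of (14) come out as the $q_y$ canonical correlations, their negatives, and $q_x - q_y$ values that are zero up to rounding error.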


Accordingly, we can write in the general case

(20)     $R^2_{y_j.U} = \sum_{i=1}^{q_x} r^2_{y_j u_i} = \sum_{i=1}^{q_y} r^2_{y_j u_i}$,

since, from (19),

(21)     $r_{y_j u_k} = \sum_{i=1}^{q_y} a_i r_{v_i u_k} = 0$, for $q_x \geq k > q_y$.


The proof for the same rank case ($q_x = q_y = q$) can then be applied
directly to the general case ($q_y < q_x$).

It deserves to be emphasized that this result is not immediately
obvious. In words, it means that the canonical variates comprise the
total amount of variation in the X-variables which is relevant to the
original variation in the Y-set. All residual variation in the X-set
is orthogonal to the original Y-variables.
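
The general case can be illustrated in the same way. The sketch below assumes synthetic data with $q_x = 4$ and $q_y = 2$ and hypothetical names (`r_vu`, `r_yu`, `redundancy`). It builds all $q_x$ X-variates from a full SVD, checks that the correlations in (19) and (21) vanish for the extra variates, and compares the redundancy index (7) with the mean of the squared multiple correlations from (6).

```python
import numpy as np

# Synthetic data with q_x > q_y (an assumption for illustration only).
rng = np.random.default_rng(5)
N, qx, qy = 400, 4, 2
X = rng.standard_normal((N, qx))
Y = X @ rng.standard_normal((qx, qy)) + rng.standard_normal((N, qy))
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy, Ryy = R[:qx, :qx], R[:qx, qx:], R[qx:, qx:]

Lx, Ly = np.linalg.cholesky(Rxx), np.linalg.cholesky(Ryy)
K = np.linalg.solve(Lx, Rxy) @ np.linalg.inv(Ly).T
A, lam, Bt = np.linalg.svd(K, full_matrices=True)  # A is q_x by q_x
C = np.linalg.solve(Lx.T, A)      # weights for all q_x X-variates
D = np.linalg.solve(Ly.T, Bt.T)   # weights for the q_y Y-variates

r_vu = D.T @ Rxy.T @ C            # entry (i, k) is r_{v_i u_k}
r_yu = Rxy.T @ C                  # entry (j, k) is r_{y_j u_k}

# Redundancy index (7) and the mean squared multiple correlation from (6).
redundancy = (lam ** 2 * ((Ryy @ D) ** 2).sum(axis=0)).sum() / qy
mean_R2 = np.einsum('ij,ij->j', Rxy, np.linalg.solve(Rxx, Rxy)).mean()

print("r_vu for k > q_y:", r_vu[:, qy:])
print("r_yu for k > q_y:", r_yu[:, qy:])
print(redundancy, mean_R2)
```

In this example the last $q_x - q_y$ columns of `r_vu` and `r_yu` are zero up to rounding error, so the extra X-variates carry no information about the Y-set, and the two summary values agree.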